
    Reliability Analysis of Compressed CNNs

    The use of artificial intelligence, machine learning and, in particular, Deep Learning (DL) has recently become an effective, de-facto standard solution for complex problems such as image classification, sentiment analysis and natural language processing. To address the growing performance demands of ML applications, research has focused on techniques for compressing the large number of parameters required by the Deep Neural Networks (DNNs) used in DL. These techniques include parameter pruning, weight sharing (i.e. clustering of the weights) and parameter quantization. However, reducing the number of parameters can lower the fault tolerance of DNNs, which are already sensitive to software and hardware faults caused by, among others, high-energy particle strikes, row-hammer attacks or gradient-descent attacks. In this work we analyze the fault sensitivity of widely used DNNs, in particular Convolutional Neural Networks (CNNs), that have been compressed with pruning, weight clustering and quantization. Our analysis shows that in DNNs that employ all of these compression mechanisms, i.e. with their memory footprint reduced by up to 86.3x, random single-bit faults can cause accuracy drops of up to 13.56%.
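
    As an illustration of the single-bit fault model used in analyses of this kind, the sketch below flips one uniformly chosen bit in an int8-quantized weight tensor. This is a minimal sketch, assuming NumPy is available; the model and evaluation hooks in the final comment are hypothetical placeholders, not the paper's actual harness.

        import numpy as np

        def flip_random_bit(weights, rng):
            # Return a copy of an int8 weight tensor with one random bit flipped.
            faulty = weights.copy()
            flat = faulty.view(np.uint8).reshape(-1)   # bit-level view sharing faulty's memory
            idx = rng.integers(flat.size)              # which byte (i.e. which weight)
            bit = rng.integers(8)                      # which of its 8 bits
            flat[idx] ^= np.uint8(1 << bit)
            return faulty

        rng = np.random.default_rng(0)
        weights = rng.integers(-128, 128, size=1000, dtype=np.int8)  # stand-in quantized weights
        corrupted = flip_random_bit(weights, rng)
        # accuracy_drop = evaluate(model_with(weights)) - evaluate(model_with(corrupted))
        # (evaluate/model_with are hypothetical hooks into an inference framework)

    Repeating this injection many times over different weights and bits yields the accuracy-drop distribution that such a sensitivity analysis reports.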

    DeSyRe: on-Demand System Reliability

    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chip (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect- and fault-free system would impact heavily, even prohibitively, the design, manufacturing and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. To reduce the overheads of fault tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints.

    Comparison of Psychological Distress between Type 2 Diabetes Patients with and without Proteinuria

    We investigated the link between proteinuria and psychological distress among patients with type 2 diabetes mellitus (T2DM). A total of 130 patients with T2DM aged 69.1±10.3 years were enrolled in this cross-sectional study. Urine and blood parameters, age, height, body weight, and medications were analyzed, and each patient's psychological distress was measured using the six-item Kessler Psychological Distress Scale (K6). We compared the K6 scores between the patients with and without proteinuria. Forty-two patients (32.3%) had proteinuria (≥±), and the mean HbA1c level was 7.5±1.3%. The K6 scores of the patients with proteinuria were significantly higher than those of the patients without proteinuria, even after adjusting for age and sex. A multiple regression analysis demonstrated that proteinuria, rather than age, sex or HbA1c, had the dominant clinical impact. Proteinuria was closely associated with higher psychological distress. Preventing and improving proteinuria may reduce psychological distress in patients with T2DM.

    Reconfigurable NoC and Processors Tolerant to Permanent Faults

    Advances in the semiconductor industry have led to reduced transistor dimensions and increased device density, but inevitably they have compromised the reliability of modern computing systems. In this thesis, we address the reliability problem by exploiting hardware reconfiguration for tolerating permanent faults. Processing components in a system-on-chip are divided into smaller Substitutable Units (SUs), and reconfigurable interconnects are used to isolate defective SUs and connect spare units to create a fault-free component. Furthermore, employing fine-grain logic for instantiating a functionally equivalent unit is another reconfiguration option considered. Based on these approaches, the first part of this thesis presents a probabilistic analysis of reconfigurable designs for calculating the average number of constructable components at different fault densities. Considering the area overheads of reconfigurability, we evaluate the resilience of various reconfigurable designs with different granularities (SU sizes). Concisely, the results reveal that the combination of fine- and coarse-grain reconfiguration offers up to 3x more fault tolerance compared to component redundancy. Performing a design-space exploration to find the most efficient granularity mix shows that different fault densities require different granularities of substitutable units to maximize fault tolerance. Moreover, we explored the performance effects of pipelining the reconfigurable interconnects in adaptive processors and observed that the operating frequency and execution time of the pipelined design are roughly 2.5x and 2x better, respectively, than those of the design with non-pipelined interconnects. In the second part of this thesis, we describe RQNoC, a service-oriented Network-on-Chip (NoC) resilient to permanent faults. We characterize the network resources based on the particular service they support and, when faulty, bypass them, allowing the respective traffic class to be redirected. We propose service merging (SMerge) and service detouring (SDetour) as the two service redirection schemes. Different RQNoC configurations are implemented and evaluated in terms of performance, area, power consumption and fault tolerance. Concisely, the evaluation results show that, compared to the baseline network, SMerge requires 51% more area and 27% more power and has a 9% slower clock, but maintains at least 90% of the network connectivity even in the presence of 32 permanent network faults.
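
    To make the probabilistic analysis concrete, the Monte-Carlo sketch below estimates the average number of constructable processors when each of N instances is split into K substitutable-unit types, any fault-free SU can substitute into any instance, and a component is constructable only if a fault-free SU of every type is available. This is a deliberately simplified model with illustrative parameter names, not the thesis's exact formulation.

        import numpy as np

        def expected_constructable(n_instances=8, n_su_types=4, p_fault=0.05,
                                   trials=100_000, seed=0):
            rng = np.random.default_rng(seed)
            # faulty[t, i, s]: in trial t, is SU s of type i defective?
            faulty = rng.random((trials, n_su_types, n_instances)) < p_fault
            healthy_per_type = (~faulty).sum(axis=2)       # fault-free SUs of each type
            constructable = healthy_per_type.min(axis=1)   # limited by the scarcest SU type
            return constructable.mean()

        # Average number of constructable processors at a 5% per-SU fault rate:
        print(expected_constructable())

    Sweeping p_fault and the SU granularity in such a model is the essence of the design-space exploration the abstract describes: finer SUs waste fewer healthy resources per fault but add interconnect overhead.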

    Advances on Adaptive Fault-Tolerant System Components: Micro-processors, NoCs, and DRAM

    The adverse effects of technology scaling on the reliability of digital circuits have made the use of fault tolerance techniques more necessary in modern computing systems. Digital designers continuously search for efficient techniques to improve reliability while keeping the imposed overheads low. However, unpredictable changes in the system conditions, e.g. available resources, working environment or reliability requirements, can have a significant impact on the efficiency of a fault-handling mechanism. In the light of this problem, adaptive fault tolerance (AFT) techniques have emerged as a flexible and more efficient way to maintain the reliability level by adjusting to the new system conditions. Aside from this primary application of AFT techniques, this thesis suggests that adding adaptability to hardware components provides the means for a better trade-off between achieved reliability and incurred overheads. On this account, hardware adaptability is explored on three main components of a multi-core system, namely micro-processors, Networks-on-Chip (NoC) and main memories. In the first part of this thesis, a reliable micro-processor array architecture is studied which can adapt to permanent faults. The architecture supports a mix of coarse- and/or fine-grain reconfiguration. To this end, the micro-processor is divided into smaller substitutable units (SUs) which are connected to each other using reconfigurable interconnects. Then, a design-space exploration of such an adaptive micro-processor array is presented to find the best trade-off between reliability and its overheads, considering different granularities of SUs and reconfiguration options. Briefly, the results reveal that the combination of fine- and coarse-grain reconfiguration offers up to 3x more fault tolerance at the same overhead compared to simple processor-level redundancy. The second part of this thesis presents RQNoC, a service-oriented NoC that can adapt to permanent faults. Network resources are characterized based on the particular service they support and, when faulty, they can be bypassed through two options for redirection, i.e. service merging (SMerge) and/or service detouring (SDetour). While SDetour keeps lanes of different services isolated, suffering longer paths, SMerge trades service isolation for shorter paths and higher connectivity. Different RQNoC configurations are implemented and evaluated in terms of network performance, implementation results and reliability. Concisely, the evaluation results show that, compared to the baseline network, SMerge maintains at least 90% of the network connectivity even in the presence of 32 permanent network faults, more than double that of SDetour, but imposes 51% more area, 27% more power and a 9% slower clock. Finally, the last part of this thesis presents a fault-tolerant scheme for DRAM memories that enables a trade-off between DRAM capacity and fault tolerance. We introduce Odd-ECC DRAM mapping, a novel mechanism to dynamically select Error-Correcting Codes (ECCs) of different strengths and overheads for each allocated page of a program in main memory. Odd-ECC is applied to memory systems that use conventional 2D as well as 3D-stacked DRAMs and is evaluated using various applications. Our experiments show that, compared to flat memory protection schemes, Odd-ECC reduces ECC capacity overheads by up to 39% while achieving the same Mean Time to Failure (MTTF).

    RQNoC: A resilient quality-of-service network-on-chip with service redirection

    In this article, we describe RQNoC, a service-oriented Network-on-Chip (NoC) resilient to permanent faults. We characterize the network resources based on the particular service that they support and, when faulty, bypass them, allowing the respective traffic class to be redirected. We propose two alternatives for service redirection, each having different advantages and disadvantages. The first one, Service Detour, uses longer alternative paths through resources of the same service to bypass faulty network parts, keeping traffic classes isolated. The second approach, Service Merge, uses resources of other services, providing shorter paths but allowing traffic classes to interfere with each other. The remaining network resources that are common to all services employ additional mechanisms for tolerating faults. Links tolerate faults using additional spare wires combined with a flit-shifting mechanism, and the router control is protected with Triple-Modular-Redundancy (TMR). The proposed RQNoC network designs are implemented in 65nm technology and evaluated in terms of performance, area, power consumption, and fault tolerance. Service Detour requires 9% more area and consumes 7.3% more power compared to a baseline network that is not tolerant to faults. Its packet latency and throughput are close to the fault-free performance at low fault densities, but fault tolerance and performance drop substantially for 8 or more network faults. Service Merge requires 22% more area and 27% more power than the baseline and has a 9% slower clock. Compared to a fault-free network, a Service Merge RQNoC with up to 32 faults has packet latency increased by 1.5x to 2.4x and throughput reduced to 70% or 50%. However, it delivers substantially better fault tolerance, having a mean network connectivity above 90% even with 32 network faults, versus 41% for a Service Detour network. Combining Service Merge and Service Detour improves fault tolerance further, sustaining a higher number of network faults with reduced packet latency.
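
    The TMR protection of the router control mentioned above reduces, at its core, to a two-out-of-three majority vote over redundant copies of the control state. The sketch below is a generic bitwise voter for illustration only, not RQNoC's actual RTL; the function name is hypothetical.

        def tmr_vote(a: int, b: int, c: int) -> int:
            # Bitwise 2-of-3 majority: a bit is set iff at least two copies agree.
            return (a & b) | (b & c) | (a & c)

        # One corrupted copy (c) is outvoted by the two healthy ones:
        assert tmr_vote(0b1010, 0b1010, 0b0011) == 0b1010

    In hardware this is a few gates per control bit, which is why TMR is reserved for the small, critical router-control logic rather than the wide datapaths, where spare wires and flit shifting are cheaper.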

    Highly Scalable Implementation of a Robust MMSE Channel Estimator for OFDM Multi-Standard Environment

    In this paper a VLSI implementation of a highly scalable MMSE (Minimum Mean Square Error) channel estimator is presented, with the ultimate goal of demonstrating the potential of MMSE as an enabler for multi-standard channel estimation. By selecting an appropriate implementation, a complexity reduction of 98% is achieved when compared to Time-Domain Maximum Likelihood Estimation (TDMLE), whereas low power consumption is accomplished by implementing a low-power mode. The architecture is capable of performing Least Squares (LS) estimation and MMSE estimation compliant with 3GPP LTE (Long Term Evolution), IEEE 802.11n (WLAN), and DVB-H (Digital Video Broadcasting - Handheld). The estimator is synthesized using a 65 nm low-leakage high-threshold standard-cell CMOS library. The design occupies an area of 0.169 mm², is capable of running at up to 250 MHz, and provides a throughput of 78 M estimates/second. Simulations under typical LTE reception show that the implementation dissipates 4.9 µW per sample.
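
    For reference, the textbook linear MMSE channel estimate is obtained by Wiener-filtering the LS estimate at the pilot positions; the paper's robust, multi-standard variant builds on this form (notation assumed here, not taken from the paper):

        \hat{\mathbf{h}}_{\mathrm{MMSE}} = \mathbf{R}_{hh}\,\bigl(\mathbf{R}_{hh} + \sigma_n^{2}\mathbf{I}\bigr)^{-1}\,\hat{\mathbf{h}}_{\mathrm{LS}}

    where R_hh is the channel autocorrelation matrix, σ_n² the noise variance, and ĥ_LS the least-squares estimate. The matrix inversion is what makes a direct implementation costly; scalability across LTE, 802.11n and DVB-H hinges on approximating this filter for different pilot patterns and channel statistics.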

    Odd-ECC: On-demand DRAM error correcting codes

    An application may have different sensitivity to faults in different subsets of the data it uses; some data regions may therefore be more critical than others. Capitalizing on this observation, Odd-ECC provides a mechanism to dynamically select, on demand, the memory fault tolerance of each allocated page of a program, depending on the criticality of the respective data. Odd-ECC error-correcting codes (ECCs) are stored in separate physical pages, hidden by the OS as pages unavailable to the user. Still, these ECCs are physically aligned with the data they protect, so the memory controller can access them efficiently. Thereby, the capacity, performance and energy overheads of memory fault tolerance are proportional to the criticality of the data stored. Odd-ECC is applied to memory systems that use conventional 2D DRAM DIMMs as well as 3D-stacked DRAMs, and is evaluated using various applications. Compared to flat memory protection schemes, Odd-ECC substantially reduces ECC capacity overheads while achieving the same Mean Time to Failure (MTTF), and in addition it slightly improves performance and energy costs. Under the same capacity constraints, Odd-ECC achieves a substantially higher MTTF compared to flat memory protection. This comes at a performance and energy cost which is, however, still a fraction of the cost introduced by a flat, equally strong scheme.
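
    A toy model of the on-demand selection Odd-ECC describes: each allocated page picks an ECC strength matching its criticality, so the capacity overhead is paid only where it is needed. The policy, scheme names and overhead ratios below are illustrative assumptions, not the paper's actual codes or numbers.

        # Fraction of a page's capacity spent on check bits (assumed ratios):
        ECC_OVERHEAD = {
            "none":    0.0,
            "parity":  1 / 64,   # assumption: 1 parity bit per 64-bit word
            "sec-ded": 8 / 64,   # assumption: (72,64) SEC-DED, 8 check bits per word
        }

        def ecc_for_page(criticality: str) -> str:
            # Illustrative policy mapping a page's criticality tag to an ECC scheme.
            return {"low": "none", "medium": "parity", "high": "sec-ded"}[criticality]

        # 70% non-critical, 20% medium, 10% highly critical pages:
        pages = ["low"] * 70 + ["medium"] * 20 + ["high"] * 10
        overhead = sum(ECC_OVERHEAD[ecc_for_page(c)] for c in pages) / len(pages)
        print(f"average ECC capacity overhead: {overhead:.1%}")  # 1.6% vs 12.5% for flat SEC-DED

    The paper's mechanism additionally keeps each ECC page physically aligned with the data page it protects, so the memory controller can locate check bits without extra indirection.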